life science
Open and Sustainable AI: challenges, opportunities and the road ahead in the life sciences (October 2025 -- Version 2)
Farrell, Gavin, Adamidi, Eleni, Buono, Rafael Andrade, Anton, Mihail, Attafi, Omar Abdelghani, Gutierrez, Salvador Capella, Capriotti, Emidio, Castro, Leyla Jael, Cirillo, Davide, Crossman, Lisa, Dessimoz, Christophe, Dimopoulos, Alexandros, Fernandez-Diaz, Raul, Fragkouli, Styliani-Christina, Goble, Carole, Gu, Wei, Hancock, John M., Khanteymoori, Alireza, Lenaerts, Tom, Liberante, Fabio G., Maccallum, Peter, Monzon, Alexander Miguel, Palmblad, Magnus, Poveda, Lucy, Radulescu, Ovidiu, Shields, Denis C., Sufi, Shoaib, Vergoulis, Thanasis, Psomopoulos, Fotis, Tosatto, Silvio C. E.
Artificial intelligence (AI) has recently seen transformative breakthroughs in the life sciences, expanding possibilities for researchers to interpret biological information at an unprecedented capacity, with novel applications and advances being made almost daily. In order to maximise return on the growing investments in AI-based life science research and accelerate this progress, it has become urgent to address the exacerbation of long-standing research challenges arising from the rapid adoption of AI methods. We review the increased erosion of trust in AI research outputs, driven by the issues of poor reusability and reproducibility, and highlight their consequent impact on environmental sustainability. Furthermore, we discuss the fragmented components of the AI ecosystem and lack of guiding pathways to best support Open and Sustainable AI (OSAI) model development. In response, this perspective introduces a practical set of OSAI recommendations directly mapped to over 300 components of the AI ecosystem and provides guiding implementation pathways. Our work connects researchers with relevant AI resources, facilitating the implementation of sustainable, reusable and reproducible AI. Built upon life science community consensus and aligned to existing efforts, the outputs of this perspective are designed to aid the future development of policy and additional structured pathways for guiding AI implementation.
- Europe > Italy > Emilia-Romagna > Metropolitan City of Bologna > Bologna (0.04)
- Europe > Belgium > Brussels-Capital Region > Brussels (0.04)
- Europe > Switzerland > Vaud > Lausanne (0.04)
- Overview (1.00)
- Research Report > Experimental Study (0.68)
- Information Technology (1.00)
- Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
- Energy (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)
Four Shades of Life Sciences: A Dataset for Disinformation Detection in the Life Sciences
Seidlmayer, Eva, Galke, Lukas, Förstner, Konrad U.
Disseminators of disinformation often seek to attract attention or evoke emotions - typically to gain influence or generate revenue - resulting in distinctive rhetorical patterns that can be exploited by machine learning models. In this study, we explore linguistic and rhetorical features as proxies for distinguishing disinformative texts from other health and life-science text genres, applying both large language models and classical machine learning classifiers. Given the limitations of existing datasets, which mainly focus on fact-checking misinformation, we introduce Four Shades of Life Sciences (FSoLS): a novel, labeled corpus of 2,603 texts on 14 life-science topics, retrieved from 17 diverse sources and classified into four categories of life science publications. The source code for replicating and updating the dataset is available on GitHub: https://github.com/EvaSeidlmayer/FourShadesofLifeSciences
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Oceania > Australia > Victoria > Melbourne (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- Media > News (1.00)
- Health & Medicine > Therapeutic Area > Vaccines (1.00)
- Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
SciHorizon: Benchmarking AI-for-Science Readiness from Scientific Data to Large Language Models
Qin, Chuan, Chen, Xin, Wang, Chengrui, Wu, Pengmin, Chen, Xi, Cheng, Yihang, Zhao, Jingyi, Xiao, Meng, Dong, Xiangchao, Long, Qingqing, Pan, Boya, Wu, Han, Li, Chengzan, Zhou, Yuanchun, Xiong, Hui, Zhu, Hengshu
In recent years, the rapid advancement of Artificial Intelligence (AI) technologies, particularly Large Language Models (LLMs), has revolutionized the paradigm of scientific discovery, establishing AI-for-Science (AI4Science) as a dynamic and evolving field. However, there is still a lack of an effective framework for the overall assessment of AI4Science, particularly from a holistic perspective on data quality and model capability. Therefore, in this study, we propose SciHorizon, a comprehensive assessment framework designed to benchmark the readiness of AI4Science from both scientific data and LLM perspectives. First, we introduce a generalizable framework for assessing AI-ready scientific data, encompassing four key dimensions: Quality, FAIRness, Explainability, and Compliance, which are subdivided into 15 sub-dimensions. Drawing on data resource papers published between 2018 and 2023 in peer-reviewed journals, we present recommendation lists of AI-ready datasets for both Earth and Life Sciences, making a novel and original contribution to the field. Concurrently, to assess the capabilities of LLMs across multiple scientific disciplines, we establish 16 assessment dimensions based on five core indicators (Knowledge, Understanding, Reasoning, Multimodality, and Values) spanning Mathematics, Physics, Chemistry, Life Sciences, and Earth and Space Sciences. Using the developed benchmark datasets, we have conducted a comprehensive evaluation of over 20 representative open-source and closed-source LLMs. All the results are publicly available and can be accessed online at www.scihorizon.cn/en.
- Asia > Thailand (0.14)
- North America > Mexico > Mexico City (0.14)
- Asia > China > Tibet Autonomous Region (0.14)
- Information Technology > Security & Privacy (1.00)
- Health & Medicine (1.00)
Users Favor LLM-Generated Content -- Until They Know It's AI
Parshakov, Petr, Naidenova, Iuliia, Paklina, Sofia, Matkin, Nikita, Nesseler, Cornel
In this paper, we investigate how individuals evaluate human- and large language model-generated responses to popular questions when the source of the content is either concealed or disclosed. Through a controlled field experiment, participants were presented with a set of questions, each accompanied by a response generated by either a human or an AI. In a randomized design, half of the participants were informed of the response's origin while the other half remained unaware. Our findings indicate that, overall, participants tend to prefer AI-generated responses. However, when the AI origin is revealed, this preference diminishes significantly, suggesting that evaluative judgments are influenced by the disclosure of the response's provenance rather than solely by its quality. These results underscore a bias against AI-generated content, highlighting the societal challenge of improving the perception of AI work in contexts where quality assessments should be paramount.
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
Chasing AI's value in life sciences
Given rising competition, higher customer expectations, and growing regulatory challenges, these investments are crucial. But to maximize their value, leaders must carefully consider how to balance the key factors of scope, scale, speed, and human-AI collaboration. The common refrain from data leaders across all industries--but specifically from those within data-rich life sciences organizations--is "I have vast amounts of data all over my organization, but the people who need it can't find it." And in a complex healthcare ecosystem, data can come from multiple sources including hospitals, pharmacies, insurers, and patients. "Addressing this challenge," says Sheeran, "means applying metadata to all existing data and then creating tools to find it, mimicking the ease of a search engine. Until generative AI came along, though, creating that metadata was extremely time consuming."
AutoDSL: Automated domain-specific language design for structural representation of procedures with constraints
Shi, Yu-Zhe, Hou, Haofei, Bi, Zhangqian, Meng, Fanxu, Wei, Xiang, Ruan, Lecheng, Wang, Qining
Accurate representation of procedures in restricted scenarios, such as non-standardized scientific experiments, requires precise depiction of constraints. Unfortunately, Domain-specific Language (DSL), as an effective tool to express constraints structurally, often requires case-by-case hand-crafting, necessitating customized, labor-intensive efforts. To overcome this challenge, we introduce the AutoDSL framework to automate DSL-based constraint design across various domains. Utilizing domain-specified experimental protocol corpora, AutoDSL optimizes syntactic constraints and abstracts semantic constraints. Quantitative and qualitative analyses of the DSLs designed by AutoDSL across five distinct domains highlight its potential as an auxiliary module for language models, aiming to improve procedural planning and execution.
- North America > United States > Missouri > St. Louis County > St. Louis (0.04)
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
- Asia > China > Beijing > Beijing (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
- Law (0.92)
- Health & Medicine > Therapeutic Area > Hematology (0.46)
- Materials > Chemicals > Commodity Chemicals (0.46)
Computing in the Life Sciences: From Early Algorithms to Modern AI
Donkor, Samuel A., Walsh, Matthew E., Titus, Alexander J.
Computing in the life sciences has undergone a transformative evolution, from early computational models in the 1950s to the applications of artificial intelligence (AI) and machine learning (ML) seen today. This paper highlights key milestones and technological advancements through the historical development of computing in the life sciences. The discussion includes the inception of computational models for biological processes, the advent of bioinformatics tools, and the integration of AI/ML in modern life sciences research. Attention is given to AI-enabled tools used in the life sciences, such as scientific large language models and bio-AI tools, examining their capabilities, limitations, and impact to biological risk. This paper seeks to clarify and establish essential terminology and concepts to ensure informed decision-making and effective communication across disciplines. The views and opinions expressed within this manuscript are those of the authors and do not necessarily reflect the views and opinions of any organization the authors are affiliated with.
- North America > United States > New York > New York County > New York City (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
VERA: Generating Visual Explanations of Two-Dimensional Embeddings via Region Annotation
Poličar, Pavlin G., Zupan, Blaž
Two-dimensional embeddings obtained from dimensionality reduction techniques, such as MDS, t-SNE, and UMAP, are widely used across various disciplines to visualize high-dimensional data. These visualizations provide a valuable tool for exploratory data analysis, allowing researchers to visually identify clusters, outliers, and other interesting patterns in the data. However, interpreting the resulting visualizations can be challenging, as it often requires additional manual inspection to understand the differences between data points in different regions of the embedding space. To address this issue, we propose Visual Explanations via Region Annotation (VERA), an automatic embedding-annotation approach that generates visual explanations for any two-dimensional embedding. VERA produces informative explanations that characterize distinct regions in the embedding space, allowing users to gain an overview of the embedding landscape at a glance. Unlike most existing approaches, which typically require some degree of manual user intervention, VERA produces static explanations, automatically identifying and selecting the most informative visual explanations to show to the user. We illustrate the usage of VERA on a real-world data set and validate the utility of our approach with a comparative user study. Our results demonstrate that the explanations generated by VERA are as useful as fully-fledged interactive tools on typical exploratory data analysis tasks but require significantly less time and effort from the user.
- Europe > Germany (0.05)
- Europe > Slovenia > Central Slovenia > Municipality of Ljubljana > Ljubljana (0.04)
- Europe > Portugal > Coimbra > Coimbra (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Questionnaire & Opinion Survey (1.00)
- Information Technology > Data Science > Data Mining (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Knowledge Graphs for the Life Sciences: Recent Developments, Challenges and Opportunities
Chen, Jiaoyan, Dong, Hang, Hastings, Janna, Jiménez-Ruiz, Ernesto, López, Vanessa, Monnin, Pierre, Pesquita, Catia, Škoda, Petr, Tamma, Valentina
The term life sciences refers to the disciplines that study living organisms and life processes, and include chemistry, biology, medicine, and a range of other related disciplines. Research efforts in life sciences are heavily data-driven, as they produce and consume vast amounts of scientific data, much of which is intrinsically relational and graph-structured. The volume of data and the complexity of scientific concepts and relations referred to therein promote the application of advanced knowledge-driven technologies for managing and interpreting data, with the ultimate aim to advance scientific discovery. In this survey and position paper, we discuss recent developments and advances in the use of graph-based technologies in life sciences and set out a vision for how these technologies will impact these fields into the future. We focus on three broad topics: the construction and management of Knowledge Graphs (KGs), the use of KGs and associated technologies in the discovery of new knowledge, and the use of KGs in artificial intelligence applications to support explanations (explainable AI). We select a few exemplary use cases for each topic, discuss the challenges and open research questions within these topics, and conclude with a perspective and outlook that summarizes the overarching challenges and their potential solutions as a guide for future research.
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
- Europe > Switzerland > Zürich > Zürich (0.14)
- North America > United States > New York > New York County > New York City (0.04)
- Research Report > New Finding (1.00)
- Overview (1.00)
- Information Technology > Security & Privacy (1.00)
- Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
- Health & Medicine > Health Care Technology (1.00)
- Education (1.00)
Mapping the micro and macro of biology with spatial omics and AI
Spatial omics refers to the ability to measure the activity of biomolecules (RNA, DNA, proteins, and other omics) in situ--directly from tissue samples. This is important because many biological processes are controlled by highly localized interactions between cells that take place in spatially heterogeneous tissue environments. Spatial omics allows previously unobservable cellular organization and biological events to be viewed in unprecedented detail. A few years ago, these technologies were just prototypes in a handful of labs around the world. They worked only on frozen tissue and they required impractically large amounts of precious tissue biopsies.